LSTM-Based Machine Translation for Madurese-Indonesian

Authors

Abstract

Madurese is one of the regional languages of Indonesia and is dominant in East Java and on Madura Island in particular. Its use as a daily language has declined significantly due to a language shift among children and adolescents, caused in part by a sense of prestige and by the difficulty of learning Madurese. The scarcity of research or scientific titles on Madurese also helps reduce literacy in the language. Our research focuses on creating a Madurese-Indonesian machine translation system to maintain and preserve the existence of the Madurese language, so that preservation can be done through digital media. This study uses the latest Madurese-Indonesian dataset, a corpus of 30,000 Madurese-Indonesian sentence pairs from the online Bible. We scraped the online Bible pages and organized the scraped data into bilingual pairs. We then manually processed the text to match the two languages' scraping results, and applied normalization and tokenization to remove non-printable characters and punctuation from the corpus. To perform neural machine translation (NMT), we connected an RNN encoder with an RNN decoder model; for training and testing we used a sequential LSTM model, and the BLEU measure was used to assess the accuracy of the translation results. With a softmax function, the Adam optimizer, and added settings including 128 layers and an added Dropout layer, we obtained average evaluation results of BLEU-1 0.798068, BLEU-2 0.680932, BLEU-3 0.623489, and BLEU-4 0.523546 over the five tests conducted. Given the differences between Madurese and Indonesian, this is the best approach ...
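The cleaning and scoring steps named in the abstract can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names `clean_sentence`, `ngram_counts`, and `cumulative_bleu` are hypothetical, and the choices of NFKC normalization, lowercasing, ASCII-only punctuation, a single reference, uniform n-gram weights, and no smoothing are all assumptions that the abstract does not specify.

```python
import math
import string
import unicodedata
from collections import Counter

# Deletes ASCII punctuation only (assumption: the abstract does not say
# which punctuation set was removed from the corpus).
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_sentence(text):
    """Normalize, drop non-printable characters and punctuation, then tokenize."""
    # NFKC normalization and lowercasing are assumed normalization steps.
    text = unicodedata.normalize("NFKC", text).lower()
    # Replace non-printable characters with a space so words are not merged.
    text = "".join(ch if ch.isprintable() else " " for ch in text)
    text = text.translate(_PUNCT_TABLE)
    # Whitespace tokenization (assumption).
    return text.split()


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cumulative_bleu(reference, candidate, max_n):
    """Sentence-level cumulative BLEU up to order max_n.

    Uses one reference, uniform weights, and no smoothing, so any n-gram
    order with zero matches gives a score of 0.0.
    """
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        total = sum(cand.values())
        # Clipped counts: a candidate n-gram is credited at most as often
        # as it occurs in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty for candidates no longer than the reference.
    if len(candidate) > len(reference):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(reference) / len(candidate))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return brevity * math.exp(sum(log_precisions) / max_n)
```

Calling `cumulative_bleu(ref_tokens, hyp_tokens, n)` for `n` from 1 to 4 gives per-sentence analogues of the BLEU-1 to BLEU-4 figures. The reported averages are most likely corpus-level scores from a library implementation, so values from this sketch are not directly comparable.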

Similar resources

Rule-based Machine Translation between Indonesian and Malaysian

We describe the development of a bidirectional rule-based machine translation system between Indonesian and Malaysian (id-ms), two closely related Austronesian languages natively spoken by approximately 35 million people. The system is based on the re-use of free and publicly available resources, such as the Apertium machine translation platform and Wikipedia articles. We also present our appro...

LSTM Neural Reordering Feature for Statistical Machine Translation

Artificial neural networks are powerful models that have been widely applied to many aspects of machine translation, such as language modeling and translation modeling. Though notable improvements have been made in these areas, the reordering problem remains a challenge in statistical machine translation. In this paper, we present a novel neural reordering model that directly models ...

An Analysis of Indonesian Language for Interlingual Machine-Translation System

This paper presents BIAS (Bahasa Indonesia Analyzer System), an analysis system for the Indonesian language suitable for a multilingual machine translation system. BIAS is developed with a motivation to contribute to an on-going cooperative research project in machine translation between Indonesia and other Asian countries. In addition, it may serve to foster NLP research in Indonesia. It starts with an overview of va...

Handling Indonesian Clitics: A Dataset Comparison for an Indonesian-English Statistical Machine Translation System

In this paper, we study the effect of incorporating morphological information on an Indonesian (id) to English (en) Statistical Machine Translation (SMT) system as part of a preprocessing module. The linguistic phenomenon that is being addressed here is Indonesian cliticized words. The approach is to transform the text by separating the correct clitics from a cliticized word to simplify the wor...

Toward Asian Speech Translation System: Developing Speech Recognition and Machine Translation for Indonesian Language

In this paper, we present a report on the research and development of a speech-to-speech translation system for Asian languages, primarily on the design and implementation of speech recognition and machine translation systems for the Indonesian language. As part of the A-STAR project, each participating country will need to develop each component of the full system for the corresponding language. We w...

Journal

Journal title: Journal of Applied Data Sciences

Year: 2023

ISSN: 2723-6471

DOI: https://doi.org/10.47738/jads.v4i3.113